Abstract

Background: The nonautonomous maize Ds transposons can only move in the presence of the autonomous
element Ac. They comprise a heterogeneous group that share 11-bp terminal inverted repeats (TIRs) and some
subterminal repeats, but vary greatly in size and composition. Three classes of Ds elements can cause mutations:
Ds-del, internal deletions of the 4.6-kb Ac element; Ds1, ~400 bp in size and sharing little homology with Ac; and
Ds2, variably sized elements containing about 0.5 kb from the Ac termini and unrelated internal sequences. Here,
we analyze the entire complement of Ds-related sequences in the genome of the inbred B73 and ask whether
additional classes of Ds-like (Ds-l) elements, not uncovered genetically, are mobilized by Ac. We also compare the
makeup of Ds-related sequences in two maize inbreds of different origin.

Results: We found 903 elements with 11-bp Ac/Ds TIRs flanked by 8-bp target site duplications. Three resemble Ac,
but carry small rearrangements. The others are much shorter, once extraneous insertions are removed. There are
331 Ds1 and 39 Ds2 elements, many of which are likely mobilized by Ac, and two novel classes of Ds-l elements.
Ds-l3 elements lack subterminal homology with Ac, but carry transposase gene fragments, and represent decaying
Ac elements. There are 44 such elements in B73. Ds-l4 elements share little similarity with Ac outside of the 11-bp
TIR, have a modal length of ~1 kb, and carry filler DNA which, in a few cases, could be matched to gene
fragments. Most Ds-related elements in B73 (486/903) fall in this class. None of the Ds-l elements tested responded
to Ac. Only half of Ds insertion sites examined are shared between the inbreds B73 and W22.

Conclusions: The majority of Ds-related sequences in maize correspond to Ds-l elements that do not transpose in
the presence of Ac. Unlike actively transposing elements, many Ds-l elements are inserted in repetitive DNA, where
they probably become methylated and begin to decay. The filler DNA present in most elements is occasionally
captured from genes, a rare feature in transposons of the hAT superfamily to which Ds belongs. Maize inbreds of
different origin are highly polymorphic in their DNA transposon makeup.



Background 

The first transposable element discovered by McClintock [1]
was Ds (Dissociation). Ds could break chromosomes at its
site of insertion and could move in response to another
factor, which she named Ac (Activator) and showed to be
self-mobilizing or autonomous [2]. Shortly after this
discovery, McClintock established that there were two types
of Ac-responsive Ds elements: those that caused chromosome
breaks at high frequency and those that did not. She named
the former state I Ds and the latter state II Ds [3]. Thus,
early on, it became clear that Ds elements could differ
genetically. Yet a common origin for these elements was
suggested by the observation that state I elements could
change to state II. How heterogeneous Ds elements are became
clear only after Ac and Ds were isolated and characterized
molecularly [4-6].

The 4.6-kb autonomous Ac element makes a single 3.8-kb
transcript that spans most of the element's length and
encodes an 807-amino acid putative transposase [7]. It
causes an 8-bp target site duplication (TSD), ends in 11-bp
terminal inverted repeats (TIRs), and contains, within the
terminal 200 bp at either end, multiple copies of a
hexameric repeat to which the Ac transposase binds [8]. Ds
elements share TIRs with Ac and also cause 8-bp duplications
of the target site. Three very different kinds of Ds
elements can transpose and cause mutations in response to
Ac. Ds-del elements are simple internal deletion derivatives
of Ac found predominantly in genetic stocks that recently
carried an active Ac [4,9,10]. Ds1 elements are short
(~400 bp) and share with Ac only the 11-bp TIRs and a few of
the subterminal hexameric repeats [11]. Ds2 elements are
> 1 kb in length and share extensive sequence homology with
Ac in the terminal 200 bp at either end [12]. In many Ds2
elements, a large part of Ac's internal sequence has been
replaced with an unrelated or "filler" sequence. The ability
of Ds elements to cause chromosome breaks turned out to be a
function of their structure: all chromosome breakers have
multiple transposon ends [5,13,14].

Ac and genetically defined Ds elements transpose
preferentially into unique or low-copy sequences and largely
avoid the repetitive DNA that makes up the bulk of the maize
genome [15,16]. Therefore, these elements are highly
efficient insertion mutagens. Ac is absent from most maize
lines or populations and is usually present in one copy in
lines with Ac activity. In contrast, many Ds-hybridizing
sequences are present in the genomes of all lines examined
[17], but there has not yet been a concerted effort to
characterize all the sequences related to Ds in the maize
genome. To that end, we created a heuristic search algorithm
based on the sequence of the Ac/Ds TIRs and the size of the
TSD and ran it on the maize pseudomolecules. Each Ds-related
element identified via this search was individually numbered
and annotated, and its insertion site was categorized
according to uniqueness and content in the B73 maize genome
sequence [18].

A total of 903 elements with Ds sequence features were
located within the B73 genome. A minority resembled
previously described elements: there are 331 Ds1 and 39 Ds2
elements, most of which are probably mobilized by Ac. In
addition, two new classes of Ds-like (Ds-l) elements were
identified. Ds-l3 elements lack extensive subterminal
homology with Ac, but carry fragments from various parts of
the Ac transposase gene. There are 44 such elements in B73.
Ds-l4 elements share little similarity with Ac outside of
the 11-bp TIR and have a modal length of ~1 kb. The majority
of Ds-related elements in B73 (486/903) fall in this class.
None of the Ds-l elements tested were excised by Ac. Unlike
Ac and recently transposed Ds elements, about half of the
Ds-related elements identified in this study had inserted in
repetitive DNA. Conversely, repetitive sequences, such as
long terminal repeat (LTR) retrotransposons, were found
within some Ds-l elements. Some elements carried gene
fragments, a rare feature of transposons of the hAT
superfamily to which Ds belongs, though a common feature of
other transposons. Lastly, only half of Ds insertion sites
examined were shared between the inbreds B73 and W22,
indicating that the makeup of DNA transposons will vary
greatly among inbreds of unrelated origin.



Methods 

Development and implementation of the Ds discovery 
algorithm 

The 11-bp terminal inverted repeats (TIRs) are specific
sequences that define the 5' end (C/TAGGGATGAAA) and 3' end
(TTTCATCCCTA) of each Ac/Ds element and play a key role in
transposition [19-22]. The other identifying trait of an
Ac/Ds element is the target site duplication (TSD). Not part
of the transposon, the TSD is a direct repeat of the same
8 base pairs immediately flanking the element, upstream of
the 5' TIR and downstream of the 3' TIR. Unlike the 11-bp
TIR, the 8-bp TSD is not a specific sequence. Rather,
because it is the site where the Ds element inserted into
the genome, the TSD can be almost any combination of 8 bp.
Based on the sequence characteristics of Ac and Ds elements,
we developed a data mining algorithm, written in Perl, to
search the maize pseudomolecules for Ds sequences. The
search recovered sequences with perfect TIRs and identical
TSD sequences, as well as sequences whose TIRs and TSDs
differed by up to 2 bp. The algorithm generated putative Ds
elements and identified the position of their TSD in the
genome. Putative Ds elements were then used to BLAST-search
the maize genome database for related Ds sequences in which
the TIRs or TSDs differed by more than 2 bp.
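The search logic described above can be illustrated with a minimal sketch. It is not the authors' Perl program: it is written in Python, scans only the forward strand, requires exact TIRs, and allows a configurable number of TSD mismatches. The function names and the `max_len` cap are assumptions made for illustration only.

```python
import re

# TIR sequences as given in the text; the 5' TIR starts with C or T.
TIR_5 = re.compile(r"[CT]AGGGATGAAA")
TIR_3 = "TTTCATCCCTA"
TSD_LEN = 8


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidates(seq, max_len=30000, max_tsd_mismatch=0):
    """Return (start, end, left_tsd, right_tsd) tuples; 0-based, end exclusive.

    max_len is a hypothetical cap on element length, not a value from the paper.
    """
    seq = seq.upper()
    hits = []
    for m in TIR_5.finditer(seq):
        start = m.start()
        if start < TSD_LEN:
            continue  # no room for an upstream TSD
        left_tsd = seq[start - TSD_LEN:start]
        pos = seq.find(TIR_3, m.end())
        while pos != -1 and pos + len(TIR_3) - start <= max_len:
            end = pos + len(TIR_3)
            right_tsd = seq[end:end + TSD_LEN]
            if (len(right_tsd) == TSD_LEN
                    and mismatches(left_tsd, right_tsd) <= max_tsd_mismatch):
                hits.append((start, end, left_tsd, right_tsd))
            pos = seq.find(TIR_3, pos + 1)
    return hits
```

Calling `find_candidates(seq, max_tsd_mismatch=2)` would loosen the TSD criterion in the spirit of the 2-bp tolerance mentioned in the text; tolerating mismatches within the TIRs themselves would need an approximate matcher, which this sketch omits.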

Because known Ds1 elements are smaller than 500 bp and Ds2
elements are larger than 1 kb, all elements were first
separated by length into groups of elements measuring
< 500 bp, 500 to 1000 bp, 1000 to 5000 bp, and > 5000 bp.
Then, elements were located in the maize genome via BLAST,
and redundant sequences were eliminated. Each sequence was
then compared to known Ds1 and Ds2 elements using BLAST2,
and categorized based on those results. The nature of the
target sites was determined by a series of tests, including
presence of start and stop codons, presence of untranslated
region sequences, EST (expressed sequence tag) support, and
presence of corresponding mRNA in GenBank.
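The initial length-based grouping can be sketched as follows. The four bins come from the text; how the shared boundaries (exactly 1000 bp or 5000 bp) are assigned is not stated there, so the choices below are assumptions, as are the function names.

```python
from collections import Counter


def size_bin(length_bp):
    """Assign an element length (bp) to one of the four size groups."""
    if length_bp < 500:
        return "<500 bp"
    if length_bp <= 1000:      # boundary handling is an assumption
        return "500-1000 bp"
    if length_bp <= 5000:      # boundary handling is an assumption
        return "1000-5000 bp"
    return ">5000 bp"


def bin_counts(lengths):
    """Tally how many elements fall into each size group."""
    return Counter(size_bin(n) for n in lengths)
```

For instance, the 7344-bp compound element described in the Results would land in the `>5000 bp` group and be set aside for closer inspection.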

Prior annotation of the predicted Ds elements and of the
genome sequences 200 bp upstream and downstream from each
element's insertion site was checked in the database at
maizesequence.org. The 200 bp flanking each element were
analyzed for uniqueness using BLAST, maizesequence.org, and
RepeatMasker (http://www.repeatmasker.org/). The 200-bp
subterminal regions within each sequence were examined for
the AAACGG hexamer repeats, which bind the Ac transposase
[8]. Only three copies of these hexamers in the 5' end
have been shown to be required for transposition [23].
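A hexamer count of the kind described above could look like the sketch below. Counting the motif on both strands (AAACGG and its reverse complement CCGTTT) and using a fixed 200-bp window at each end are assumptions; the text does not spell out how the repeats were tallied.

```python
HEXAMER = "AAACGG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def subterminal_hexamer_counts(element, window=200):
    """Count AAACGG, in either orientation, in the first and last `window` bp.

    For elements shorter than 2 * window the two windows overlap.
    """
    element = element.upper()
    rc = revcomp(HEXAMER)  # "CCGTTT"
    five = element[:window]
    three = element[-window:]
    return {
        "five_prime": five.count(HEXAMER) + five.count(rc),
        "three_prime": three.count(HEXAMER) + three.count(rc),
    }
```

`str.count` is adequate here because neither AAACGG nor CCGTTT can overlap a second copy of itself.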

Assay for Ds mobility 

The B73 stock used in the Ds mobility assay was obtained
from the USDA North Central Region Plant Introduction
Station at Ames, IA (PI-550473). The W22 stocks used were:
the standard Wx version and a version carrying wx-m7(Ac), an
unstable wx allele described by McClintock [24]. This allele
arose by insertion of the 4.6-kb Ac element in the 5'
untranslated region of the Wx gene [20,25]. Activation of Ds
excision by Ac was monitored by PCR in an F1 between B73 and
the W22 stock carrying wx-m7(Ac) [15]. DNA was extracted
from young seedling leaves and PCR reactions were run as
described [26]. PCR primers were designed based on the
500-bp sequences flanking each Ds element end. All primer
sequences are listed in Additional File 1, Table S1.

Analysis of Ds elements in W22 

Physically sheared W22 DNA was size-fractionated by agarose
gel electrophoresis to a 300- to 500-bp size range. Sheared
DNA ends were blunt-ended and kinased with an Epicentre
End-It DNA end repair kit and adaptors were ligated with T4
DNA ligase. PCR amplifications were carried out with an
adaptor primer and primers ending in the Ac/Ds TIRs. PCR
products were cloned into pGEM-T Easy vector and sequenced
in a 3730 DNA analyzer.

Results 

Structure of Ds-related elements 

Ds1 elements are short, less than 500 bp in length, and
share little sequence in common with Ac, other than the
11-bp TIRs and a few copies of the AAACGG hexameric repeat
found in the Ac subterminal regions (Figure 1). They were
first detected in unstable mutations [6,27,28] and are known
to occur in at least 50 copies in the genomes of all maize
lines [29]. In contrast, Ds2 elements are much closer to Ac,
sharing with Ac the ~200-bp subterminal regions (STR) and
variable stretches of internal sequence, and appear to have
originated from Ac by deletion (Figure 1). Most carry, in
addition, "filler" sequence from other parts of the genome
[30,31]. They have been estimated to occur in about a dozen
copies and, like Ds1, have been found in several unstable
mutations.

In addition to the previously described Ds1 and Ds2 element
classes, we have identified a large number of elements that,
though containing the hallmark Ac/Ds TIRs and flanking an
8-bp duplication, do not match any previously described Ds
sequences. They lack obvious homology to Ac in the terminal
200-bp sequences and may not be mobilized by Ac, so we have
referred to them as Ds-like (Ds-l) elements. Ds-l elements
can be divided into two groups, Ds-l3 and Ds-l4. Ds-l3
elements have an average length of 4 kb and retain parts of
exons 2 and 3 of the Ac transposase (Figure 1). Ds-l4
elements are on average about 1 kb long and, though very
numerous and variable in sequence, do not differ much in
size. The homology between Ds-l4 and Ac is restricted to
little more than 30 bp on the 5' and 3' ends (Figure 1).

Figure 1 Structure of different types of Ds elements compared with Ac. Ac is 4565 bp long and encodes a 5-exon transposase. Ds1
elements are the shortest and share little in common with Ac. Ds2 elements have ~200 bp of the Ac subterminal region at each end. Ds-l3
elements carry sequences corresponding to parts of exons 2 and 3 of the Ac transposase and are the longest Ds elements, on average. Ds-l4
elements have a modal length of about 1 kb and share with Ac only about 30 bp at either end.

Content of Ds-related elements in B73 

The chromosomal distribution of all 903 Ds-related elements
in B73 is shown in Table 1. There are 331 Ds1, 39 Ds2, 44
Ds-l3, and 486 Ds-l4 elements in the B73 genome (Additional
File 1, Tables S2-S5). A χ² test showed no significant
difference between the observed distribution of Ds-related
elements per chromosome and the expected distribution based
on chromosome length in the 10 chromosomes of B73 (Figure 2).
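A goodness-of-fit comparison of this kind can be sketched as below: expected counts are taken to be proportional to chromosome length, and the statistic is the usual sum of (observed - expected)² / expected, with one fewer degree of freedom than the number of chromosomes. The function name and its inputs are illustrative assumptions; the per-chromosome counts and lengths used in the paper are not reproduced here.

```python
def chi_square_by_length(observed, lengths):
    """Chi-square statistic and degrees of freedom for counts vs. lengths.

    observed: element counts per chromosome
    lengths:  chromosome lengths in any consistent unit
    """
    if len(observed) != len(lengths):
        raise ValueError("observed and lengths must have the same size")
    total = sum(observed)
    total_len = sum(lengths)
    # Expected count for each chromosome is proportional to its length.
    expected = [total * length / total_len for length in lengths]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1
```

With ten chromosomes the test has 9 degrees of freedom, which agrees with the value quoted in the legend of Figure 2.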

Three elements, classified here as Ac-like elements, share a
large amount of similarity with Ac, save for small indels. A
similar element called Ac-cryptic has been described
previously [32]. As anticipated from its history [33], the
B73 genome contains no simple deletion derivatives of Ac.
More than 90% of Ds-related elements in the genome fall
either in the Ds1 or Ds-l4 classes. Ds1 elements were known
to be numerous based on earlier hybridization data [11], but
their copy number had been estimated to be around 50, so the
present analysis of the maize genome sequence pushes up
their number at least sixfold. Of the 331 Ds1 elements in
B73, 184 were identified as such by the maize sequencing
effort (maizesequence.org), 4 were incompletely annotated,
and 143 were missed (Table 2). All 39 Ds2 elements
identified here had variable filler sequences unrelated to
Ac. Of the 39, only 2 were fully annotated as Ds sequences
in the maizesequence.org database, 26 were partially
annotated, and 11 were missed. We searched the B73 genome
for the specific Ds2 elements previously isolated from
unstable mutations and found that the element in wx-B4 [34]
was present in two copies (Ds2-2 and Ds2-lS in Additional
File 1, Table S3), but those in adh1-2F11 [12] or sh2-m1
[30] were absent.



Table 1 Chromosomal distribution of Ds elements in B73 
None of the Ds-l elements was annotated as a full Ds element
in maizesequence.org. Among the 44 Ds-l3 elements, 42 were
partially annotated as Ds and 2 are new. Notably, of the 486
Ds-l4 elements, only 62 had been partially annotated as Ds
elements, while the remaining 424 had been missed. Despite
their dissimilarities with the Ac sequence, the Ds-l4
elements themselves are related to each other. A Jalview
phylogram [35,36] shows that Ds-l4 elements fall into 3 main
clusters, after excluding from the alignment the 30 elements
that are larger than 5 kb (Additional File 2, Figure S1).
The Ds-l4 elements in each cluster have a high degree of
similarity throughout. Cluster 1 (top part of the
phylogenetic tree) contains 57 elements, cluster 2 (middle
of the phylogenetic tree) contains 339 elements, and
cluster 3 (bottom of the phylogenetic tree) contains 60
elements.

Nature of Ds insertion sites 

Ds1 and Ds2 sequences have been shown to cause mutations
when they insert into genes [6,12,30,31,37-39], so it is not
surprising that several Ds and Ds-l sequences were found in
or near genes. As shown in Table 3, 37 Ds1 (11%), 5 Ds2
(13%), 2 Ds-l3 (5%), and 34 Ds-l4 (7%) elements had inserted
in or within 200 bp of a gene model. Given the propensity
for Ac and Ds to insert into or near genes, it was
surprising that relatively few Ds elements mapped into or
near genes in B73. Somewhat lower percentages of Ds-l than
Ds elements are found near genes. Correspondingly, slightly
higher percentages of Ds-l elements are inserted in
repetitive or intergenic low-copy DNA (49.8% vs. 41.5%,
respectively). Overall, higher percentages of Ds-related
elements are found in repetitive DNA sequences than would
have been expected from prior genetic analyses of recently
transposed Ac or Ds elements in maize [15,16].
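The near-gene percentages quoted above follow directly from the counts in the text and the class totals (331, 39, 44, and 486). The short check below recomputes them; pooling Ds1 with Ds2 and the two Ds-l classes with each other is an illustrative grouping chosen here, not a calculation reported in the paper.

```python
# Counts of elements in or within 200 bp of a gene model, from the text.
near_genes = {"Ds1": 37, "Ds2": 5, "Ds-l3": 2, "Ds-l4": 34}
class_totals = {"Ds1": 331, "Ds2": 39, "Ds-l3": 44, "Ds-l4": 486}


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


def pooled_percent(names):
    """Near-gene percentage for several classes taken together."""
    hits = sum(near_genes[n] for n in names)
    total = sum(class_totals[n] for n in names)
    return percent(hits, total)


per_class = {n: round(percent(near_genes[n], class_totals[n])) for n in near_genes}
ds_pooled = pooled_percent(["Ds1", "Ds2"])       # 42 of 370 elements
dsl_pooled = pooled_percent(["Ds-l3", "Ds-l4"])  # 36 of 530 elements
```

The per-class values round to 11, 13, 5, and 7%, and the pooled figures come to roughly 11% for Ds1 plus Ds2 against roughly 7% for the Ds-l classes, consistent with the statement that Ds-l elements are somewhat less often found near genes.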

Ds and Ds-like elements carrying gene fragments 

Although transposons from other superfamilies, such as
MULEs [40], CACTA [41], and Helitrons [42,43], can take up
gene fragments, this property has not been reported for
members of the hAT superfamily, to which Ds belongs [7]. In
this study, we have identified both Ds2 and Ds-l4 elements
that carry gene fragments (Figure 3).

Two Ds2 elements appear to have captured gene fragments.
Ds2-22 carries exons 5 and 6 and the intervening intron from
an IAA-amino acid hydrolase ILR1-like 3 gene. This gene
fragment appears to have been copied from its original
location on chromosome 2 onto Ds2-22, which resides on
chromosome 9 of B73 (Figure 4). The sequence carried in the
Ds2-18 element is annotated as a complete gene encoding
hypothetical protein LOC100383699. This annotation is
supported by EST and mRNA evidence. However, all copies of
this coding sequence are transposon-borne in B73, raising
doubts as to the functional significance of the transcript
in the sequence database.

Figure 2 Distribution of Ds elements in the 10 chromosomes of B73. The observed distributions are not significantly different from those
expected based on the length of the 10 chromosomes (χ² = 9.05, 9 df, 0.5 > P > 0.3).

Similarly, some Ds-l4 elements appear to have captured gene
fragments. Ds-l4-354, which resides on chromosome 8 of B73,
carries a fragment of the ZmMybst1 gene, which is located on
chromosome 9 (Figure 4). The 7344-bp Ds-l4-483 element on
chromosome 10 is an example of a compound Ds-l4 element that
has trapped extraneous sequences between its termini
(Figure 5). Bases 1-987 of Ds-l4-483 correspond to a regular
Ds-l4 element with perfect TIRs flanked by a TSD. Directly
following this are: a fragment of an LTR-retrotransposon
polyprotein (988 to 1650), a transcribed sequence from
chromosome 5 (1651 to 6537), and a duplicate, but truncated,
copy of the first Ds-l4 in this compound element, ending in
a perfect TIR and an additional copy of the TSD (6538 to
7344).

Two additional sets of transcribed sequences, found in the
GenBank mRNA database, are carried by repetitive Ds-l4
elements. Ds-l4-93, Ds-l4-13S, Ds-l4-307, and Ds-l4-343
contain the two putative exons of Zea mays clone 1443778
mRNA and Ds-l4-244, Ds-l4-275, and Ds-l4-368 contain the two
putative exons of Zea mays clone 1451145 mRNA. The introns
separating these exons in the elements end in canonical
GT-AG. Neither mRNA encodes a protein with a homolog in the
sequence databases, so again, it is conceivable that the
only copies of the transcribed sequences in the genome are
those carried in the transposons. Nevertheless, the presence
in different chromosomes of multiple copies of two sets of
Ds-l4 elements with almost identical transcribed sequences
suggests that these elements transposed in the recent past,
after acquisition of the sequences.

Table 2 New and previously identified Ds elements in B73

                     Previously identified
Element    New    Partially annotated    Fully annotated    Total
Ds1        143    4                      184                331
Ds2        11     26                     2                  39
Ds-l3      2      42                     0                  44
Ds-l4      424    62                     0                  486
Ac-like    0      3                      0                  3
Total      580    137                    186                903

Table 3 Nature of sequences adjacent to Ds insertions in B73

Element    Genes (in or within 200 bp)    Repetitive DNA    Intergenic DNA    Total
Ds-l4      34
Total      78                             375               450               903

[Only part of the table body survives; the stray cell values 21 and 25 cannot be assigned to a row.]

Figure 3 DNA sequences inserted within Ds elements. The sequences can originate from genes, LTR retrotransposons, other DNA
transposons, or intergenic DNA.

[Figure 4 diagram: Ds2-22 (1121 bp) and Ds-l4-354 (1510 bp), each drawn from 5' TAGGGATGAAA to TTTCATCCCTA 3'; donor gene
IAA-amino acid hydrolase ILR1-like 3 from chromosome 2; donor gene ZmMybst1 from chromosome 9.]

Figure 4 Genic DNA capture by Ds elements. Ds2-22 and Ds-l4-354 are located on chromosomes 9 and 8, and their inserted gene fragments,
containing intron sequences, are from chromosome 2 and 9, respectively. Exons are diagrammed in green or orange, introns in yellow, and Ds
transposon sequences in blue.

Nongenic insertions in Ds elements

Large insertions were found in all Ds element classes
(Figure 3). Not surprisingly, the majority were LTR
retrotransposons, which constitute the bulk of the maize
genome [44]. Ten Ds1 elements were much longer than average,
measuring between 0.8 and 24 kb. LTR retrotransposons were
inserted in six of them (Additional File 1, Table S2).
Although the average length of Ds2 elements was
approximately 1.4 kb, some, such as Ds2-13, were as long as
12.7 kb as a result of LTR retrotransposon insertions
(Additional File 1, Table S3). Similarly, LTR
retrotransposons were inserted in 7 Ds-l3 and 35 Ds-l4
elements (Additional File 1, Tables S4 and S5). Though less
numerous, DNA transposons were also found inside of Ds
elements. For example, a Ds-l4 element was nested inside of
Ds-l3-15. This nesting pattern is common for LTR
retrotransposons [45], but rarer for Ds transposon types,
probably because nested Ds elements acquire
chromosome-breaking properties [46] and would be selected
against. In agreement, no chromosome-breaking double Ds [5]
was detected in B73. In addition to transposons, intergenic
sequences without obvious transposon properties could be
found inside of Ds elements. These most likely represent DNA
capture events that do not include coding sequences.
Intergenic DNA sequences of various lengths were found in 4
Ds1, 1 Ds-l3, and 14 Ds-l4 elements (Additional File 1,
Tables S1, S3, and S4).

Mobility of Ds and Ds-l elements 

The terminal 200 bp at either end of Ac are required for
wild-type levels of transposition [47]. Within these
terminal sequences, there are multiple AAACGG repeats that
bind to the Ac transposase in a cooperative fashion and are,
most likely, the subterminal sequences that impart
specificity to the transposition reaction [8,22,23]. Only
three copies of these hexamers have been shown to be
required for transposition [23]. The numbers of B73
Ds-related elements with 3 or more copies of the subterminal
repeat in each class are: 17 of 331 Ds1, all 39 Ds2, 37 of
44 Ds-l3, and 438 of 486 Ds-l4 (Table 4). Thus, the vast
majority of Ds2, Ds-l3, and Ds-l4 elements contain 3 or more
AAACGG repeats within their subterminal regions, but only a
few Ds1 elements do. However, the lack of these hexamers
does not preclude transposability, as some Ds1 sequences
lack them and are still able to transpose, probably because
the transposase can interact with sequences related to,
though not identical with, the hexameric repeat [8].

In order to test the mobility of the newly identified Ds-l3,
Ds-l4, and control Ds2 elements in response to Ac, we
developed the following PCR excision assay. We first
identified Ds-l3, Ds-l4, and Ds2 elements in B73 that had
inserted in single-copy DNA and were shared with the inbred
W22. We then crossed B73 with the W22 genetic stock c
wx-m7(Ac) and monitored filial and parental DNAs for the
presence of a somatic Ds excision or "empty" site in the F1,
but not in B73. Figure 6 shows the PCR results for
representative elements. Ds2-21 shows a clear excision band
in the F1, whereas Ds2-31 does not (Figure 6A), indicating
that the former can transpose in response to Ac, but the
latter cannot. An alignment of the 5' and 3' ends of Ds2-21
and Ds2-31 with Ac (Figure 7) reveals that Ds2-31 has
undergone a duplication-deletion at the 3' end, which
probably interferes with its mobility. On the other hand,
none of the shared Ds-l elements tested (Ds-l3-28,
Ds-l4-118, Ds-l4-169, Ds-l4-199, Ds-l4-266, Ds-l4-337, and
Ds-l4-378) produced an excision band in the F1 (Figure 6B
and data not shown). Thus, in spite of possessing
subterminal AAACGG hexamers, neither the Ds-l3 nor the Ds-l4
elements tested were able to transpose. Possibly, the 5' and
3' ends of Ds-l3 and Ds-l4 elements, as well as of 17 of 39
Ds2 elements (alignment data not shown), are too defective
to be mobilized by Ac.

Table 4 Ds and Ds-like sequences with 3 or more copies of the hexamer sequence AAACGG within their subterminal regions

Chromosome    Ds1    Ds2    Ds-l3    Ds-l4    Ac-like    Totals
1             3      7      3        76       1          90
2             0      5      3        42       1          51
3             0      2      6        48       0          56
4             4      2      5        63       0          74
5             5      4      3        42       1          55
6             0      0      3        30       0          33
7             2      3      5        37       0          47
8             1      6      5        38       0          50
9             1      2      0        24       0          27
10            0      4      3        32       0          39
Unknown       1      4      1        6        0          12
Totals        17     39     37       438      3          534

Sampling the Ds Elements of W22 

The only complete maize genome sequence in the public domain
is that of the B73 inbred. This genome is constantly being
revised and recompiled, and serves as the de facto maize
database for BLAST. The W22 inbred line is being utilized in
genetic studies aimed at developing Ac/Ds transposon tagging
resources [15,16], so knowledge of the full complement of
Ds-related elements in W22 would be valuable. In the course
of this study, we found that 9 of 18 Ds or Ds-l elements
present in B73 were absent in W22 (e.g., Ds-l4-176 in
Figure 6B), which precluded a PCR test for excision in the
F1. To estimate the fraction of Ds insertion sites in W22
that are not shared with B73, we proceeded to isolate Ds 5'
junction sequences in W22 by a modified transposon display
procedure, sequenced a set of random clones, identified 25
different junctions, and BLASTed them against the B73
genome. Only 13 matched Ds insertion sites in B73 (e.g.,
Ds1-208, Ds2-3S, Ds-l3-21, Ds-l4-61, Ds-l4-169, and
Ds-l4-346). Although only a small sample of Ds sites was
analyzed, it is clear that a sizable fraction of Ds element
insertion sites are polymorphic between these two inbred
lines.
Figure 6 PCR assay for Ac-driven excision of Ds2 and Ds-l4 elements. (A) Ds2-21 and Ds2-31 elements. Lanes: 1 and 18, 1-kb ladder; 2 and
10, W22 inbred (Wx); 3 and 11, B73, pooled seedlings; 4 and 12, B73 plant #1; 5 and 13, B73 plant #2; 6 and 14, wx-m7 plant #1; 7 and 15,
wx-m7 plant #2; 8 and 16, F1 (B73-1 x wx-m7-2); 9 and 17, F1 (B73-2 x wx-m7-2). Only the Ac/+ heterozygous F1 individuals (lanes 8-9) showed the
band expected from somatic Ds2-21 excision. The absence of that band in the wx-m7(Ac) homozygotes (lanes 6-7) can be attributed to the
well-established negative dosage effect of Ac [59]. No somatic excision of Ds2-31 could be detected in the Ac/+ heterozygous F1 individuals (lanes
16-17). (B) Ds-l4-176 and Ds-l4-199 elements. Lanes: 1, B73, plant #1; 2, wx-m7 plant #2; 3, F1 (B73-1 x wx-m7-2); 4, B73, plant #1; 5, wx-m7 plant
#2; 6, F1 (B73-1 x wx-m7-2); 7, 100-bp ladder; 8, 1-kb ladder. Ds-l4-176 is polymorphic, i.e., not shared between B73 and W22, and cannot be
assayed. No somatic excision of Ds-l4-199 could be detected in the Ac/+ heterozygous F1 individuals (lane 6).
5' end

Ac     CAGGGATGAAAGTAGGATGGGAAAATCCCGTACCGACCGTTATCGTATAACCGATTTTGTTAGTTTTATCCCGATCGATTTCGAACCCGAG
Ds2-21 TAGGGATGAAAGTAGGATGGGAAAATCCCGTACCGACCGTTATCGTATAACCGATTTTGTTAGTTTTGTCCCGATCGATTTCGAACCCGAG
Ds2-31 TAGGGATGAAAGTAGGATGGGAAAATCTCGTACCGACCGTTATCGTATAACTGATTTTGTTAGTTTTATTCCGATCGATTTCGAACCCGAG

3' end

Ac     GTCCCGCAAGTTAAATATGAAAATGAAAACGGTAGAGGTAT TTTACCGACCGTTACCGACCGTTTTCATCCCTA
Ds2-21 GTCCCGCAAGTTAAATATGAAAATGAAAACGGTAGAGGCAT TTTACCGACCGTTCCCGACCGTTTTCATCCCTA
Ds2-31 GTCCCGCAAGTTAAATATGAAAATGAAAACGGTAGAGGTATCCGCAAGTTAAATATGAAAATGAAAACGGTAGAGGTATTTTACCGACCG TTTTCATCCCTA

Figure 7 Alignment of the 5' and 3' terminal 80 bp from Ac, Ds2-21, and Ds2-31. All three 5' ends align well, except for two SNPs in
Ds2-21. In contrast, the 3' ends of Ac and Ds2-21 align well, but the 3' end of Ds2-31 is interrupted by a duplication and a deletion.



Discussion 

Novel Ds-like elements 

We report here a comprehensive analysis of Ds and Ds-like
elements in the genome of B73, a maize inbred that lacks Ac
activity (Additional File 1, Tables S2-S5). Two novel
classes of Ds-like elements, Ds-l3 and Ds-l4, were
identified in this study. Although they share TIR sequences
with Ac, Ds-l3 elements differ greatly from Ac in the
subterminal regions. Most Ds-l3 elements contain sequences
from exons 2 and 3, the two largest exons of the 807-amino
acid Ac transposase, but lack the C terminus, which is
highly conserved among hAT elements and correlates with
transposase activity [48]. The Ac-matching areas of these
Ds-l3 elements have been incorrectly annotated as belonging
to Ac elements in the maizesequence.org database.

Ds-l4 elements have the least in common with Ac, just 30 bp
at either end, which explains why they had escaped detection
until the present work, even though they comprise the
majority of Ds-related sequences in the genome (54%). Ds1
elements, an established mobile clade within the Ac/Ds
family, also share little in common with Ac beyond their TIR
sequences. The main differences between Ds-l4 and Ds1
sequences are: their length, the former being about twice as
large; the fraction of elements with 3 or more copies of the
AAACGG subterminal repeat (438/486 vs. 17/331,
respectively); and the nature of the non-Ac filler sequence.
Ds1 elements share a common filler which exhibits minimal
sequence variation [27]. Ds-l4 elements, on the other hand,
have variable filler sequences, although their phylogenetic
analysis reveals clusters of related elements (Additional
File 2, Figure S1). The high copy number and high degree of
similarity of the filler sequences in each cluster make it
difficult to identify the original source of these sequences
in the genome.

Our algorithm provided for up to 2-bp variation in either
the element's TIR or the host TSD. To determine if the
algorithm had missed any elements not flanked by a TSD,
which is dispensable for transposition [49,50], we removed
the TSD requirement from our script. The new search produced
only 11 additional putative elements, with flanking host
sequence identity ranging from 1 in 8 to 5 in 8. There were
3 Ds1, 1 Ds2, 2 Ds-l3, and 5 not clearly related to any of
the classes identified in this study. The ends of these 5
elements are well conserved, so probably all of them
represent Ds elements at various stages of decay. This
exercise suggests that we have essentially defined the
entire Ac-Ds family in maize.
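The "identity in 8" figures above can be read as the number of matching positions between the 8 bp upstream and the 8 bp downstream of a candidate element. A minimal sketch of that measure follows; the function name, argument names, and coordinate convention are assumptions for illustration and are not taken from the authors' script.

```python
def flank_identity(seq, start, end, flank_len=8):
    """Matching positions between the flanks of seq[start:end].

    start is the 0-based index of the first base of the element and end is
    one past its last base. Returns None if either flank is incomplete.
    """
    left = seq[start - flank_len:start] if start >= flank_len else ""
    right = seq[end:end + flank_len]
    if len(left) != flank_len or len(right) != flank_len:
        return None
    return sum(a == b for a, b in zip(left.upper(), right.upper()))
```

A perfect TSD scores 8 of 8, a flank pair differing at up to 2 positions (the tolerance of the original search) scores at least 6, and the 11 additional candidates described above would score between 1 and 5.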

Transposition of Ds and Ds-I elements 

The hexameric repeat sequence AAACGG located in
the STR of Ac is necessary for transposition [8,22]. All
Ds2, 37 Ds-l3, and 438 Ds-l4 elements possessed hexamers
in numbers judged sufficient for transposition [23].
However, none of the Ds-l elements tested by our PCR
assay showed evidence of somatic excision in the
presence of Ac, whereas the tested Ds2 elements did. The
most likely explanation for this negative result is that
the 5' and 3' ends of Ds-l3 and Ds-l4 elements are too
defective to be mobilized by Ac. Alternatively, these
elements may reside in hypermethylated DNA, which has
been shown to inhibit Ac transposition [51]. The
methylation status of Ds sequences was not analyzed, as this
was beyond the scope of the present study. Multiple
copies of the hexameric repeat may not be the key
requirement for Ds1 mobility because some known
mobile Ds1 elements contain just one copy of the repeat
[11,27]. Of the 331 Ds1 elements in the B73 genome,
314 lack them. Kunze and Starlinger [8] have
emphasized that the lack of these hexamers does not exclude
an element from being able to transpose. Ds1 elements
are presumed to be transposable because other
subterminal sequences may also interact with the transposase
complex.

The occurrence of LTR retrotransposon sequences in
Ds and Ds-like elements is not surprising because these 
sequences constitute more than three quarters of the 
maize genome [18]. An answer to the question of 
whether or not their increased size renders these ele- 
ments nontransposable can be sought in the existing lit- 
erature. There are several examples of Ds elements with 
large, unrelated stretches of DNA that can still trans- 
pose, provided an Ac element is present in the genome 
[52-54]. More recently, a macrotransposon with a Ds 
element at one end and an Ac element at the other was 
shown to mobilize as much as 100 kb of intervening 
DNA consisting mostly of nested retrotransposons [55]. 
Thus, it is unlikely that increased size alone makes Ds-l
elements incapable of transposition.

Gene capture 

Several different types of DNA transposons, including 
the recently discovered Helitrons, are known to pick up 
fragments of genes from the genome. However, there 
are no reports of gene fragment capture by hAT ele- 
ments in the literature. Here, we identified 2 Ds2 and 9 
Ds-l4 elements containing gene fragments. How excisive
DNA transposons acquire gene fragments is not known, 
but it has been suggested that host sequence capture 
may result from DNA replication errors during repair of 
the double strand breaks caused by transposon excision 
[56]. 

The notion that gene capture proceeds via genomic
DNA, as in all other DNA transposons, is supported by
the two cases of Ds2 elements with trapped gene
sequences. In the Ds2-22 element, which carries a
fragment of an IAA amino acid hydrolase ILR1-like 3 gene,
the intron sequence between exons 5 and 6 is present,
arguing against capture from an mRNA. Ds2-22 is on
chromosome 9, whereas the donor gene is on
chromosome 2 (Figure 4). The second Ds2 element is more
intriguing. A homologous transcript of unknown
function, with putative start and stop codons, can be found
in the maize mRNA sequence database. Ds-l4 elements
can also carry database-supported, transcribed genomic
sequences containing introns. For some of them, the
donor gene can be identified. An example of a captured
sequence from a known gene is provided by Ds-l4-354,
with the element on chromosome 8 and the donor gene
on chromosome 9 (Figure 4). However, putative donor
gene sequences for other fragments could not be found
in the current version of the B73 genome, suggesting
that the corresponding Ds-l4 transcripts in the databases
may represent nonfunctional transcriptional baggage.

Ds content of different inbreds 

The Ds content of the inbred W22, used in its 7
version as the Ac source for the Ds-l4 excision assay, was
surveyed in this study. Of 25 Ds or Ds-l junctions isolated
from W22, only about one-half (13) were shared with B73.
Although these two inbreds fall into different heterotic
groups [57], both represent germplasm adapted to the U.S.
Corn Belt. We infer that, as has been found for
retrotransposons [58], the content of most DNA transposons
will be highly polymorphic among maize lines.

We can also compare the distribution of Ds elements
accumulated over time in B73 with that of recently
transposed Ds elements in W22, the inbred used by Vollbrecht
et al. [15] in their studies. Additional File 3, Figure S2
shows the distribution of Ds elements in each
pseudomolecule of the B73 genome. While a statistical comparison
is unwarranted because of the smaller number of
elements in our study (no chromosomal region has >10
insertions), a visual comparison reveals similar trends:
low concentration of elements in the centromeric regions
and high concentration of elements in the distal tips of
some arms, particularly 1L, 2S, 5L, 7L, and 8L.
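
The per-chromosome counts behind such a distribution (number of insertions in each 5-Mb bin, as plotted in Additional File 3) can be tallied with a short sketch like the one below. The function name and input format are assumptions for illustration; the positions in the usage example are invented and are not data from this study.

```python
def bin_insertions(positions_bp, chrom_len_bp, bin_size_bp=5_000_000):
    """Count insertions per fixed-size bin along one chromosome.

    positions_bp: 0-based insertion coordinates on the chromosome.
    Returns a list whose i-th entry is the number of insertions falling in
    [i * bin_size_bp, (i + 1) * bin_size_bp).
    """
    n_bins = -(-chrom_len_bp // bin_size_bp)  # ceiling division
    counts = [0] * n_bins
    for pos in positions_bp:
        if not 0 <= pos < chrom_len_bp:
            raise ValueError(f"position {pos} lies outside the chromosome")
        counts[pos // bin_size_bp] += 1
    return counts


# Invented positions on a hypothetical 13-Mb chromosome.
example_counts = bin_insertions(
    [1_000_000, 4_999_999, 5_000_000, 12_000_000], 13_000_000
)
```

Bins near a centromere coordinate can then be compared with distal bins by eye, in keeping with the visual comparison described above.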

Conclusion 

Transposons, recognized today as ubiquitous
components of eukaryotic genomes, were first identified six
decades ago by their disruption of chromosomal
integrity and their mutagenic effects on genes. The elegant
genetic analysis of gene mutations that served as
reporters for their movement elucidated many transposon
properties, but the complexity of transposon
families has only been revealed by whole genome
sequencing projects. Here we report on an in-depth
analysis of all the sequences in the maize genome that are
related to Dissociation (Ds), the first transposon ever
described. Our bioinformatic approach has uncovered
over 900 Ds-related sequences, including two novel
classes that account for the majority of elements in the
family. These novel elements have diverged significantly
from Ds and, though clearly related to Ds, do not appear
to transpose in today's maize genome. Unlike actively
transposing elements, many of the Ds-related elements
are inserted in repetitive DNA, where they probably
become immobile and begin to decay. A second maize
inbred line shared only half of its Ds insertion sites with
B73, suggesting that present-day maize inbred lines
differ greatly in their DNA transposon make-up.

Additional material 



Additional file 1: Supplemental tables. Table S1. PCR primers used to
assay Ds and Ds-l somatic excisions. Table S2. Description of Ds1
elements. Table S3. Description of Ds2 elements. Table S4. Description of
Ds3 elements. Table S5. Description of Ds4 elements.

Additional file 2: Figure S1. Phylogenetic tree of 456 Ds-l4 elements.
All Ds-l4 elements, other than the 30 containing insertions of known
transposons, were aligned by the high-speed multiple sequence
alignment program MAFFT (Multiple Alignment using Fast Fourier
Transform) http://www.ebi.ac.uk/Tools/msa/mafft/. The phylogenetic tree
with the shortest branch lengths was built by Jalview software using the
neighbor joining algorithm. Three major clusters of filler sequences are
identified. There are 57 elements in cluster 1, which is the top part of the
phylogenetic tree, and 60 elements in cluster 3, the bottom part of the
tree. These clusters are relatively more divergent than cluster 2, which is
the largest one, with 339 elements in the middle of the phylogenetic
tree.

Additional file 3: Figure S2. Distribution of Ds elements in each of the 
10 B73 pseudomolecules. The x axis shows the length of each
chromosome in megabases (Mb) and the y axis shows the number of Ds
insertions in each 5-Mb bin. Approximate centromere positions are 
indicated with a black circle. 